
Doc Ref # IHD-OS-KBL-Vol 8-1.17
There is a general dependency among the three operation pipelines. Semaphores are inserted on either a
frame or a slice basis. The main CS is also notified when the decoded reference is ready for
the next frame set to be encoded. A detailed discussion is given in a later section.
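The frame-level semaphore handshake described above can be sketched in software. This is only an illustration of the signalling pattern, not the hardware mechanism: the two stage functions, the `ready` semaphore, and the `frame-N` strings are all hypothetical stand-ins, and only two of the pipelines are modelled.

```python
import threading

def run_pipeline(num_frames):
    """Run a producer stage and a consumer stage, synchronized once per frame."""
    ready = threading.Semaphore(0)   # counts frames the producer has finished
    produced = []                    # stand-in for the shared frame buffers
    consumed = []

    def producer():                  # hypothetical upstream stage
        for frame in range(num_frames):
            produced.append(f"frame-{frame}")
            ready.release()          # signal: one more frame is ready

    def consumer():                  # hypothetical downstream stage
        for frame in range(num_frames):
            ready.acquire()          # block until frame number `frame` is ready
            consumed.append(produced[frame])

    # Start the consumer first to show that it really waits on the semaphore.
    threads = [threading.Thread(target=consumer),
               threading.Thread(target=producer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed
```

Signalling per slice instead of per frame would follow the same pattern, with one `release` per completed slice.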
Host software is responsible for encoding the transport stream and all the sequence, picture, and slice
layers/headers in the bit-stream; the MFC system is responsible for compressing everything from the Slice
Data Layer down through the macroblock and block layers.
Sample Algorithmic Flow
Given the hardware components, the number of possible usage models is effectively unlimited. This is
intentional: software decides according to its own application needs, balancing coding speed, frame
latency, power consumption, and video quality, and taking into account the usage modes and user
preferences (such as low frame rate with high frame quality vs. high frame rate with low frame
quality).
In the last part of this chapter, we illustrate a generic sample showing how a compression algorithm can
be implemented on this hardware.
Step 1. The application or driver initializes the encoder with the desired configuration, including speed,
quality, targeted bit-rate, input video information, and output format and restrictions.
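As a rough illustration of the configuration gathered in Step 1, the sketch below groups the listed items into one record. Every field name, unit, and example value is an assumption made for illustration; none of them is a driver or hardware interface.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class EncoderConfig:
    """Hypothetical container for the Step 1 settings (names are illustrative)."""
    speed: str                     # e.g. "fast" or "slow" (assumed labels)
    quality: int                   # assumed scale, higher means better
    target_bitrate_kbps: int       # targeted bit-rate
    width: int                     # input video info
    height: int
    frame_rate: float
    output_format: str             # e.g. "AVC"
    max_bitrate_kbps: Optional[int] = None   # one example of an output restriction

    def average_bits_per_frame(self) -> float:
        """Average bit budget per frame implied by the targeted bit-rate."""
        return self.target_bitrate_kbps * 1000 / self.frame_rate

cfg = EncoderConfig(speed="fast", quality=50, target_bitrate_kbps=6000,
                    width=1920, height=1080, frame_rate=30.0,
                    output_format="AVC")
```

The per-frame average is the kind of quantity a rate-control scheme would start from when distributing bits across frames and macroblocks.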
Step 2. VPP – The application or driver feeds VPP one frame at a time in coded order with the specified
frame or field type, as well as transcoding information: motion vectors and coded complexity (i.e., bit size).
VPP performs denoising and deblocking based on the original and targeted bit-rates, and additionally outputs
4 spatial variances and 2 temporal variances for each macroblock as well as for the whole frame.
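The text does not say which four spatial variances are produced. The sketch below assumes, purely for illustration, one variance per 8x8 quadrant of a 16x16 luma macroblock; the function names are hypothetical, and the two temporal variances are left out because their definition is not given here.

```python
def block_variance(pixels):
    """Population variance of a flat list of pixel values."""
    n = len(pixels)
    mean = sum(pixels) / n
    return sum((p - mean) ** 2 for p in pixels) / n

def quadrant_variances(mb):
    """Return four variances for a 16x16 macroblock given as a list of rows.

    Assumed quadrant order: top-left, top-right, bottom-left, bottom-right.
    """
    result = []
    for y0 in (0, 8):
        for x0 in (0, 8):
            pixels = [mb[y][x]
                      for y in range(y0, y0 + 8)
                      for x in range(x0, x0 + 8)]
            result.append(block_variance(pixels))
    return result

# Example: a flat macroblock with a checkerboard in the top-left quadrant.
mb = [[0] * 16 for _ in range(16)]
for y in range(8):
    for x in range(8):
        mb[y][x] = 2 * ((x + y) % 2)
variances = quadrant_variances(mb)
```

A flat quadrant yields zero variance, so only the textured top-left quadrant in the example reports a nonzero value.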
Step 3. ENC – Application or driver feeds ENC one coding slice buffer at a time including all VPP output.
The frame level data is accessible to all slices.
a. The encoding setup unit (ESE) sets picture-level quality parameters (including LUTs and other
costing functions) and assigns a target bit-budget (TBB) and a maximal bit-budget (MBB) to each
macroblock based on the rate-control (RC) scheme implemented. For B-frames, it also makes the ME
search-mode decision (Fast, Slow, or Uni-directional).
b. Loop over all macroblocks: calculate the search center (MVP), then perform individual ME and IE (MEE).
The hardware may be designed to process macroblocks on multiple threads in a zigzag order to minimize
dependency issues.
c. ENC makes the macroblock-level code decision (CD) and outputs the macroblock type, intra-mode,
motion vectors, distortions, as well as TBBs and MBBs.
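One common way to order macroblocks for multi-threaded processing with minimal dependency stalls is a wavefront schedule. The sketch below assumes that each macroblock depends on its left, top-left, top, and top-right neighbors, which is a typical pattern for motion-vector prediction; whether this matches the zigzag order mentioned in step b is an assumption, and the function name is hypothetical.

```python
from collections import defaultdict

def wavefront_schedule(mb_cols, mb_rows):
    """Group macroblock coordinates (x, y) into waves.

    Under the assumed dependencies (left, top-left, top, top-right), every
    neighbor a macroblock needs has a smaller wave index x + 2*y, so all
    macroblocks within one wave can be processed in parallel.
    """
    waves = defaultdict(list)
    for y in range(mb_rows):
        for x in range(mb_cols):
            waves[x + 2 * y].append((x, y))
    return [waves[w] for w in sorted(waves)]

schedule = wavefront_schedule(4, 2)   # a tiny 4x2 macroblock frame
```

For the 4x2 example, the third wave holds `(2, 0)` and `(0, 1)`: the first macroblock of the second row can start as soon as its top-right neighbor `(1, 0)` has finished.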
Step 4. PAK – Application or driver feeds PAK one array of coded macroblocks covering a slice at a time,
including all ENC output. Original frame buffer and reconstructed reference frame buffers are also
available for PAK to access.
a. PAK may create bitstreams for all sequence, GOP, picture, and slice level headers prior to the first
macroblock.
b. Loop over all macroblocks: an accurate prediction block is constructed for either inter- or
intra-prediction (VMC & VIP). If MB distortion is less than some predetermined threshold, for a B slice